

Search: The ACM Digital Library: +vector +dsp +loop +register simd simdd


Published since January 1990 and Published before October 2003

Terms used vector dsp loop register simd simdd


Results 1 - 20 of 200


Best 200 shown


1 Energy aware compilation for DSPs with SIMD instructions

Markus Lorenz, Lars Wehmeyer, Thorsten Dräger

June 2002 ACM SIGPLAN Notices, Proceedings of the joint conferenc compilers and tools for embedded systems: software and cor embedded systems LCTES/SCOPES '02, Volume 37 Issue 7

**Publisher:** ACM Press

Full text available: pdf(220.77 KB) Additional Information: full citation, abst citings, index ten

The growing use of digital signal processors (DSPs) in embedded system use of optimizing compilers supporting special hardware features. In this compiler optimizations with the aim of minimizing energy consumption applications: This comprises loop optimizations for exploitation of SIMI zero overhead hardware loops in order to increase performance and decreconsumption. In addition, we use a phase coupled code generator ...

**Keywords**: DSP, SIMD instruction, energy minimization, vectorization, hardware loop

2 Exploiting SIMD parallelism in DSP and multimedia algorithms using the

Huy Nguyen, Lizy Kurian John

May 1999 Proceedings of the 13th international conference on Superco

**Publisher:** ACM Press

Full text available: pdf(1.16 MB) Additional Information: full citation, refer index terms

3 Simulation and architecture evaluation: Vector vs. superscalar and VLIW a embedded multimedia benchmarks

Christoforos Kozyrakis, David Patterson

November 2002 Proceedings of the 35th annual ACM/IEEE internation **Microarchitecture MICRO 35** 

**Publisher:** IEEE Computer Society Press

Full text available: pdf(1.34 MB), Publisher Site

Additional Information: full citation, abst citings, index ten

Multimedia processing on embedded devices requires an architecture tha performance, low power consumption, reduced design complexity, and s this paper, we use EEMBC, an industrial benchmark suite, to compare th architecture to superscalar and VLIW processors for embedded multimed comparison covers the VIRAM instruction set, vectorizing compiler, and that integrates a vector processor with DRAM main memory. We de ...

4 MOM: a matrix SIMD instruction set architecture for multimedia application

Jesus Corbal, Roger Espasa, Mateo Valero

January 1999 Proceedings of the 1999 ACM/IEEE conference on Super (CDROM) Supercomputing '99

**Publisher:** ACM Press

Full text available: pdf(116.12) Additional Information: full citation, reference

5 Register file and memory system design: Three-dimensional memory vector bandwidth media memory systems

Jesus Corbal, Roger Espasa, Mateo Valero

November 2002 Proceedings of the 35th annual ACM/IEEE internation **Microarchitecture MICRO 35**

**Publisher:** IEEE Computer Society Press

Full text available: pdf(1.29 MB), Publisher Site

Additional Information: full citation, abst index terms

Vector processors have good performance, cost and adaptability when ta applications. However, for a significant number of media programs, conconfigurations fail to deliver enough memory references per cycle to fee functional units. This paper addresses the problem of the memory bandw novel mechanism suitable for 2-dimensional vector architectures and targhigh effective bandwidth for SIMD memory instructions. The basi ...

6 MEDEA workshop: Indirect VLIW memory allocation for the ManArray n

Nikos P. Pitsianis, Gerald G. Pechanek

March 2003 ACM SIGARCH Computer Architecture News, Volume 3

**Publisher:** ACM Press

Full text available: pdf(623.03 KB) Additional Information: full citation, abst index terms

The indirect very long instruction word (iVLIW) architecture and its imp BOPS ManArray family of multiprocessor digital signal processors (DSI scalable alternative to the wide instruction busses usually required in a m VLIW DSP. The ManArray processors indirectly access VLIWs from sn VLIWs localized in each processing element. With this work, we present perform 1) iVLIW instruction memory allocation on multiple processing

7 Array recovery and high-level transformations for DSP applications

Björn Franke, Michael O'boyle

May 2003 ACM Transactions on Embedded Computing Systems (TEC 2

**Publisher:** ACM Press

Full text available: pdf(744.35 KB) Additional Information: full citation, abst citings, index ten

Efficient implementation of DSP applications is critical for many embed Optimizing compilers for application programs, written in C, largely foc generation and scheduling, which, with their growing maturity, are provi returns. As DSP applications typically make extensive use of pointer arit

alternative use of high-level, source-to-source, transformations has been This article develops an array recovery technique that automatically ...

**Keywords**: Pointer conversion, dataflow graphs, embedded processors, l transformations

8 Regular contributions: DSP architectures: past, present and futures

Edwin J. Tan, Wendi B. Heinzelman

June 2003 ACM SIGARCH Computer Architecture News, Volume 31

**Publisher:** ACM Press

Full text available: pdf(1.27 MB) Additional Information: full citation, abst

As far as the future of communication is concerned, we have seen that th for audio and video data to complement text. Digital signal processing (I that enables traditionally analog audio and video signals to be processed transmission, storage, reproduction and manipulation. In this paper, we v various DSP architectures and its silicon implementation. We will also d the art and examine the issues pertaining to pe ...

9 Code selection for media processors with SIMD instructions

Rainer Leupers

January 2000 Proceedings of the conference on Design, automation and DATE '00

**Publisher:** ACM Press

Full text available: Publisher Site

Additional Information: full citation, reference index terms

10 Compilers and Optimization: An empirical evaluation of high level transfo

embedded processors

Björn Franke, Michael O'Boyle

November 2001 Proceedings of the 2001 international conference on Coarchitecture, and synthesis for embedded systems CAS

**Publisher:** ACM Press

Full text available: pdf(499.08 KB) Additional Information: full citation, abst index terms

Efficient implementation of DSP applications are critical for many embe Optimising compilers for application programs written in C, largely focu generation and scheduling which, with their growing maturity, are provide returns. This paper empirically evaluates another approach, namely high source transformations. High level techniques were applied to the DSPst 3 platforms: TriMedia TM-1000, Texas Instruments TMS320C6201 and

11 Evaluating MMX technology using DSP and multimedia applications
Ravi Bhargava, Lizy K. John, Brian L. Evans, Ramesh Radhakrishnan
November 1998 Proceedings of the 31st annual ACM/IEEE internation
Microarchitecture MICRO 31

**Publisher:** IEEE Computer Society Press

Full text available: pdf(1.52 MB) Additional Information: full citation, reference index terms

**Keywords**: MMX, digital signal processing, machine measurement, perf monitoring, workload characterization

12 Exploiting a new level of DLP in multimedia applications

Jesus Corbal, Mateo Valero, Roger Espasa

November 1999 Proceedings of the 32nd annual ACM/IEEE internation Microarchitecture MICRO 32

Publisher: IEEE Computer Society

Full text available: Pdf(931.68 KB) Additional Information: full citation, abst citings, index ten

This paper proposes and evaluates MOM: a novel ISA paradigm targeted applications. By fusing conventional vector ISA approaches together wit SIMD-like (Single Instruction Multiple Data) ISAs (such as MMX), we new matrix oriented ISA which efficiently deals with the small matrix st found in multimedia applications. MOM exploits a level of DLP not reacconventional vector ISAs nor SIMD-like media ISA extensi ...

13 HIBRID-SOC: a multi-core architecture for image and video applications

S. Moch, M. Bereković, H. J. Stolberg, L. Friebe, M. B. Kulaczewski, A. I.

September 2003 ACM SIGARCH Computer Architecture News, Proce workshop on MEmory performance: DEaling with Apand architecture MEDEA '03, Volume 32 Issue 3

**Publisher:** ACM Press

Full text available: Pdf(245.38 KB) Additional Information: full citation, abst

The HiBRID-SoC multi-core architecture targets a wide range of applica particularly high processing demands, including general signal processin video de-encoding, image processing, or a combination of these tasks. For HiBRID-SoC integrates three fully programmable processor cores and value a single chip, all tied to a 64-Bit AMBA AHB bus. Its memory subsystem adapted to the high bandwidth demands of the multi-core a ...

14 Graph-based code selection techniques for embedded processors

October 2000 ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 5 Issue 4

**Publisher:** ACM Press

Full text available: pdf(356.83 KB) Additional Information: full citation, abst index terms, review

Code selection is an important task in code generation for programmable the goal is to find an efficient mapping of machine-independent intermed processor-specific machine instructions. Traditional approaches to code on tree parsing which enables fast and optimal code selection for intermed a set of data-flow trees. While this approach is generally useful in compi purpose processors, it may lead to poor code ...

**Keywords**: SIMD instructions, code selection, data-flow graphs, embedirregular data paths

15 Compilers I: Affinity-based cluster assignment for unrolled loops

Gayathri Krishnamurthy, Elana D. Granston, Eric J. Stotzer

June 2002 Proceedings of the 16th international conference on Superco

**Publisher:** ACM Press

Full text available: pdf(633.13 KB) Additional Information: full citation, abst citings, index ten

To compete performance-wise, modern VLIW processors must have fast high instruction-level parallelism (ILP). Partitioning resources (functional registers) into clusters allows the processor to be clocked faster, but oper clusters can easily become a bottleneck. Increasing the number of function the potential ILP, but only helps if the functional units can be kept busy. features, optimizations such as loop unrolling m ...

**Keywords**: VLIW architectures, affinity-based clustering (ABC) algoritl assignment, homogeneous clusters, loop optimizations, loop scheduling, partitioned register files, software pipelining

16 Trident: a scalable architecture for scalar, vector, and matrix operations

Mostafa I. Soliman, Stanislav G. Sedukhin

January 2002 Australian Computer Science Communications, Proceed Asia-Pacific conference on Computer systems architectur Volume 24 Issue 3

**Publisher:** Australian Computer Society, Inc., IEEE Computer Society Pre

Full text available: pdf(814.51 Additional Information: full citation, abst citings, index ten

Within a few years it will be possible to integrate a billion transistors on this integration level, we propose using a high level ISA to express paral instead of using a huge transistor budget to dynamically extract it. Since data structures for a wide variety of applications are scalar, vector, and n Trident processor extends the classical vector ISA with matrix operation processor consists of a set of paralle ...

**Keywords**: data parallelism, parallel processing, ring register file, scalat vector/matrix processing

17 Low power DSP's for wireless communications (embedded tutorial session

Ingrid Verbauwhede, Chris Nicol

August 2000 Proceedings of the 2000 international symposium on Low

and design ISLPED '00

**Publisher:** ACM Press

Full text available: pdf(424.32 KB) Additional Information: full citation, abst citings, index ten

Wireless communications and more specifically, the fast growing penetr phones and cellular infrastructure are the major drivers for the developm programmable Digital Signal Processors (DSPs). In this tutorial, an over of recent developments in DSP processor architectures, that makes them execute computationally intensive algorithms typically found in commun DSP processors have adapted instruction sets, memory archi ...

**Keywords**: architectures, digital signal processing, programmable proce communications

18 Improving 3D geometry transformations on a simultaneous multithreaded 5

Claude Limousin, Julien Sebot, Alexis Vartanian, Nathalie Drach-Temam

June 2001 Proceedings of the 15th international conference on Superco

**Publisher:** ACM Press

Full text available: pdf(219.10 KB) Additional Information: full citation, abst citings, index ten

In this paper we evaluate the performance of an SMT processor used as t processor for a 3D polygonal rendering engine. To evaluate this approach PMesa (a parallel version of Mesa) which parallelizes the geometry stage. We show that SMT is suitable for 3D geometry and we characterize the geometry stage in term of memory hierarchy, which is the main bottlened that latency is not fully recovered by SMT; the use of L2 ...

**Keywords**: SIMD extensions, SMT, applications specific architectures, data prefetching, parallel rendering

19 HiBRID-SoC: A Multi-Core System-on-Chip Architecture for Multimedia Applications

Hans-Joachim Stolberg, Mladen Berekovic, Lars Friebe, Soren Moch, Seba Mao, Mark B. Kulaczewski, Heiko Klusmann, Peter Pirsch

March 2003 Proceedings of the conference on Design, Automation and
Designers' Forum - Volume 2 DATE '03

**Publisher:** IEEE Computer Society

Full text available: pdf(307.90 KB), Publisher Site

Additional Information: full citation, abst

The HiBRID-SoC multi-core system-on-chip targets a wide range of app particularly high processing demands, including general signal processin video and audio de-/encoding, and a combination of these tasks. For this HiBRID-SoC integrates three fully programmable processors cores and vonto a single chip, all tied to a 64-Bit AMBA AHB bus. The processor coptimized to the particular computational characteristics ...

20 Polygon rendering on a stream architecture

John D. Owens, William J. Dally, Ujval J. Kapasi, Scott Rixner, Peter Mati

August 2000 Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Graphics hardware HWWS '00

**Publisher:** ACM Press

Full text available: pdf(161.65 KB) Additional Information: full citation, abst citings, index ten

The use of a programmable stream architecture in polygon rendering pro mechanism to address the high performance needs of today's complex sc need for flexibility and programmability in the polygon rendering pipelin a polygon rendering pipeline maps into data streams and kernels that ope how this mapping is used to implement the polygon rendering pipeline o programmable stream processor. We compare our resul ...

**Keywords**: OpenGL, SIMD, graphics hardware, kernels, media processor rendering, stream architecture, stream processing, streams



Welcome United States Patent and Trademark Office


#### Search Results


Results for "(((vector simd )<in>metadata)) <and> (pyr >= 1990 <and> pyr <= 2003)"
Your search matched 5 of 1472243 documents.
A maximum of 100 results are displayed, 25 to a page, sorted by Relevance in Descending order.

Key: IEEE JNL = IEEE Journal or Magazine; IEE JNL = IEE Journal or Magazine; IEEE CNF = IEEE Conference Proceeding; IEE CNF = IEE Conference Proceeding; IEEE STD = IEEE Standard

1. A SIMD vectorizing compiler for digital signal processing algorithms
Franchetti, F.; Puschel, M.;
Parallel and Distributed Processing Symposium., Proceedings Internationa
15-19 April 2002 Page(s):20 - 26
Digital Object Identifier 10.1109/IPDPS.2002.1015494

AbstractPlus | Full Text: PDF(381 KB) IEEE CNF
Rights and Permissions

2. A survey of parallel computer architectures
Duncan, R.;
Computer
Volume 23, Issue 2, Feb. 1990 Page(s):5 - 16
Digital Object Identifier 10.1109/2.44900

AbstractPlus | Full Text: PDF(800 KB) IEEE JNL
Rights and Permissions

3. Short vector code generation and adaptation for DSP algorithms
Franchetti, F.; Puschel, M.;
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '0'.
Volume 2, 6-10 April 2003 Page(s):II - 537-40 vol.2
Digital Object Identifier 10.1109/ICASSP.2003.1202422

AbstractPlus | Full Text: PDF(366 KB) IEEE CNF
Rights and Permissions

4. Architecture independent short vector FFTs
Franchetti, F.; Karner, H.; Kral, S.; Ueberhuber, C.W.;
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '0
Volume 2, 7-11 May 2001 Page(s):1109 - 1112 vol.2
Digital Object Identifier 10.1109/ICASSP.2001.941115

AbstractPlus | Full Text: PDF(312 KB) IEEE CNF
Rights and Permissions

5. OPSILA: a vector and parallel processor
Boeri, F.; Auguin, M.;
Computers, IEEE Transactions on
Volume 42, Issue 1, Jan. 1993 Page(s):76 - 82
Digital Object Identifier 10.1109/12.192215

AbstractPlus | Full Text: PDF(572 KB) IEEE JNL

Rights and Permissions


#### Search Results


Results for "(((vector simdd dsp)<in>metadata)) <and> (pyr >= 1990 <a

Your search matched 0 documents. A maximum of 100 results are displayed, 25 to a page, sorted by Relevance Descending order.


### No results were found.


#### Search Results


Results for "(((vector simd dsp)<in>metadata)) <and> (pyr >= 1990 <ar

Your search matched 0 documents. A maximum of 100 results are displayed, 25 to a page, sorted by Relevance Descending order.


### No results were found.


#### Search Results


Results for "(((vector access patterns)<in>metadata)) <and> (pyr >= 1990 <and> pyr <= 200..."
Your search matched 1 of 1472243 documents.

A maximum of 100 results are displayed, 25 to a page, sorted by Relevance in Descending order.

1. Introducing a new cache design into vector computers
Quing Yang;
Computers, IEEE Transactions on
Volume 42, Issue 12, Dec. 1993 Page(s):1411 - 1424
Digital Object Identifier 10.1109/12.260632

AbstractPlus | Full Text: PDF(1276 KB) IEEE JNL
Rights and Permissions
